Mechanistic Machine Learning

MInD Harvard Team Visit

George G. Vega Yon, Ph.D.

2023-05-19

Overview

Slides can be downloaded from
https://ggv.cl/slides/mind2023

Mechanistic Machine Learning

After years of focus on ever-larger datasets, attention to causal inference and mechanistic models is coming back.

Mechanistic models

  • Inference-driven (causality).
  • Great for small datasets.
  • Knowledge beyond the observed data.

Machine Learning

  • Data-driven (prediction).
  • Great for big data.
  • Finds hidden knowledge in observed data.

MechML: State-of-the-art

  • Creating a loss function with a mechanistic penalty for modeling tumor cell density (Gaw et al. 2019)

Warning

  1. Mechanistic Machine Learning is not domain-knowledge-aided feature engineering. You need a whole other model to complement the ML algorithm.

  2. This isn’t just an ML ensemble; you must have an ML and a Mech model.

How to “MechML”?

MechML: Three strategies

  • ML Correction: Use machine learning to learn the errors of a mechanistic model.
  • Mechanistic Feature: Add mechanistic predictions as a feature of a machine learning model.
  • Mechanistic Penalty: Add constraints to the ML algorithm based on a mechanistic model.

“A van Gogh-style painting of an android holding a large biology book in one hand and a computer in another, examining an evolutionary tree that, instead of leaves, have genes.”–DALL-E’s interpretation of my description.

MechML: Work-in-progress

Prediction of Gene Functions
(R01 submitted)

Combine a model of the evolution of gene function with CNNs.

Modeling Decision-Making process
(R21 in the works)

Use a social network diffusion model and ML to predict the prescription of broad-spectrum antibiotics (ABX).

Agent-Based Models
(paper)

Use MechML for “automatic” calibration and prediction adjustment.

Prediction of Gene Functions

ABM sandwich

Prescription of ABX

Extended Example:
Predicting ABX using Social Networks and ML

Goal

  • We want to build a model to predict the prescription of broad-spectrum antibiotics (ABX).

  • The goal is to understand what drives prescription.

  • From our experience, peer influence informs the decision-making process.

Social Network Model of Influence

ABM for Policy Design

ML for post-processing

  1. Fit the Mech. model.
  2. Generate predictions and the corresponding errors.
  3. Train an ML model to learn the errors.
  4. Use the fitted ML model to adjust for the expected errors.
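
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workflow used in the projects mentioned earlier: the data are made up, the Mech. model is a one-parameter line through the origin, and the ML model is a hand-written nearest-neighbor regressor. All names (`mech_predict`, `ml_error`, `mechml_predict`) are hypothetical.

```python
import math

# Made-up toy data: a linear part the mechanistic model captures,
# plus a nonlinear part it does not know about.
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + math.sin(3.0 * x) for x in xs]

# Hold out every other point to check the adjustment on unseen data.
pairs = list(zip(xs, ys))
train = [p for i, p in enumerate(pairs) if i % 2 == 0]
test = [p for i, p in enumerate(pairs) if i % 2 == 1]

# Step 1: fit the Mech. model y = beta * x by least squares.
beta = sum(x * y for x, y in train) / sum(x * x for x, _ in train)

def mech_predict(x):
    return beta * x

# Step 2: generate predictions and the corresponding errors.
train_x = [x for x, _ in train]
train_err = [y - mech_predict(x) for x, y in train]

# Step 3: an ML model learns the errors (here, k-nearest neighbors).
def ml_error(x, k=2):
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_err[i] for i in nearest) / k

# Step 4: adjust the Mech. prediction by the expected error.
def mechml_predict(x):
    return mech_predict(x) + ml_error(x)

def mse(predict, data):
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

mse_mech = mse(mech_predict, test)
mse_mechml = mse(mechml_predict, test)
print(f"Mech only: {mse_mech:.4f}  MechML: {mse_mechml:.4f}")
```

In this toy setting the held-out error of the adjusted predictions is lower than that of the Mech. model alone, because the residuals carry a smooth signal the ML model can pick up.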

Mechanistic prediction as a feature

  1. Fit the Mech. model.
  2. Generate the Mech. predictions.
  3. Fit an ML model using the Mech. predictions as features.
  4. Generate your MechML predictions.
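
As a rough sketch of the steps above, the snippet below uses a logistic growth curve as the Mech. model and an ordinary least-squares regression as the ML model. Everything here is an assumption made for illustration: the data are synthetic, the covariate `z` is invented, and the helper names (`mech_curve`, `solve`, `fit_linear`) are hypothetical.

```python
import math

# Synthetic data: logistic growth over time plus the effect of a covariate z
# that the mechanistic model ignores.
ts = [i / 5 for i in range(40)]
zs = [float((i * 7) % 5) for i in range(40)]
ys = [10 / (1 + math.exp(-(t - 4))) + 0.5 * z for t, z in zip(ts, zs)]

def mech_curve(t, mid):
    """Logistic curve with capacity 10 and unknown midpoint."""
    return 10 / (1 + math.exp(-(t - mid)))

# Step 1: fit the Mech. model (grid search over the midpoint).
def sse(mid):
    return sum((mech_curve(t, mid) - y) ** 2 for t, y in zip(ts, ys))

mid_hat = min((m / 10 for m in range(81)), key=sse)

# Step 2: generate the Mech. predictions.
mech_pred = [mech_curve(t, mid_hat) for t in ts]

def solve(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_linear(X, y):
    """Least squares via the normal equations."""
    p = len(X[0])
    xtx = [[sum(r[j] * r[k] for r in X) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(X, y)) for j in range(p)]
    return solve(xtx, xty)

# Step 3: fit the ML model with the Mech. prediction as a feature.
X = [[1.0, z, m] for z, m in zip(zs, mech_pred)]
w = fit_linear(X, ys)

# Step 4: generate the MechML predictions.
mechml_pred = [sum(wj * xj for wj, xj in zip(w, row)) for row in X]

def mse(pred):
    return sum((p - y) ** 2 for p, y in zip(pred, ys)) / len(ys)

mse_mech, mse_mechml = mse(mech_pred), mse(mechml_pred)
print(f"Mech only: {mse_mech:.4f}  MechML: {mse_mechml:.4f}")
```

The comparison here is in-sample, so it only shows that the ML model can use the mechanistic feature; a real evaluation would score both models on held-out data.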

Mechanistic Penalty

  1. Fit the Mech. model.
  2. Save the parameter estimates, \(\hat\theta_{\mbox{Mech}}\).
  3. Include the fitted Mech. model in the loss:

\[ \mbox{Loss}_{\mbox{ML}}(\theta) - \lambda\, l(\hat{y};\hat\theta_{\mbox{Mech}}) \]

  4. Minimize the loss. Bad predictions (\(\hat{y}\)) under the Mech. model will be penalized.
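
One way to read the penalized loss above is sketched below. The slide does not define \(l\), so this example assumes it is a Gaussian log-likelihood (unit variance, constants dropped) of the ML predictions under the fitted Mech. model; that choice, the toy data, the linear ML model, and the names `log_lik_mech` and `fit_ml` are all assumptions made for illustration.

```python
# Made-up toy data.
xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ys = [0.9, 1.1, 1.6, 1.7, 2.2, 2.4]
n = len(xs)

# Steps 1-2: fit the Mech. model y = theta * x and save the estimate.
theta_mech = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
mech_pred = [theta_mech * x for x in xs]

def log_lik_mech(y_hat):
    """Assumed form of l: Gaussian log-likelihood (unit variance, constants
    dropped) of the ML predictions under the fitted Mech. model."""
    return -0.5 * sum((yh - m) ** 2 for yh, m in zip(y_hat, mech_pred))

def predict(w):
    return [w[0] + w[1] * x for x in xs]

def fit_ml(lam, lr=0.05, steps=5000):
    """Steps 3-4: minimize MSE(w) - lam * l(y_hat) by gradient descent."""
    w0, w1 = 0.0, 0.0
    for _ in range(steps):
        y_hat = predict((w0, w1))
        # Per-observation derivative of the penalized loss w.r.t. y_hat.
        d = [2 / n * (yh - y) + lam * (yh - m)
             for yh, y, m in zip(y_hat, ys, mech_pred)]
        w0 -= lr * sum(d)
        w1 -= lr * sum(di * x for di, x in zip(d, xs))
    return w0, w1

w_plain = fit_ml(lam=0.0)  # ML loss only
w_pen = fit_ml(lam=1.0)    # ML loss with the mechanistic penalty

ll_plain = log_lik_mech(predict(w_plain))
ll_pen = log_lik_mech(predict(w_pen))
print(f"l without penalty: {ll_plain:.4f}  with penalty: {ll_pen:.4f}")
```

With \(\lambda > 0\), the fitted ML predictions score a higher log-likelihood under the Mech. model than the unpenalized fit, which is the intended effect of the penalty: \(\lambda\) controls how strongly the ML model is pulled toward predictions the Mech. model finds plausible.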

References

Al taweraqi, Nada, and Ross D. King. 2022. “Improved Prediction of Gene Expression Through Integrating Cell Signalling Models with Machine Learning.” BMC Bioinformatics 23 (1): 323. https://doi.org/10.1186/s12859-022-04787-8.
Baker, Ruth E., Jose-Maria Peña, Jayaratnam Jayamohan, and Antoine Jérusalem. 2018. “Mechanistic Models Versus Machine Learning, a Fight Worth Fighting for the Biological Community?” Biology Letters 14 (5): 20170660. https://doi.org/10.1098/rsbl.2017.0660.
Compagni, Riccardo Delli, Zhao Cheng, Stefania Russo, and Thomas P. Van Boeckel. 2022. “A Hybrid Neural Network-SEIR Model for Forecasting Intensive Care Occupancy in Switzerland During COVID-19 Epidemics.” PLOS ONE 17 (3): e0263789. https://doi.org/10.1371/journal.pone.0263789.
Gaw, Nathan, Andrea Hawkins-Daarud, Leland S. Hu, Hyunsoo Yoon, Lujia Wang, Yanzhe Xu, Pamela R. Jackson, et al. 2019. “Integration of Machine Learning and Mechanistic Models Accurately Predicts Variation in Cell Density of Glioblastoma Using Multiparametric MRI.” Scientific Reports 9 (1): 10063. https://doi.org/10.1038/s41598-019-46296-4.
Jia, Xiaowei, Jared Willard, Anuj Karpatne, Jordan S. Read, Jacob A. Zwart, Michael Steinbach, and Vipin Kumar. 2021. “Physics-Guided Machine Learning for Scientific Discovery: An Application in Simulating Lake Temperature Profiles.” ACM/IMS Transactions on Data Science 2 (3): 1–26. https://doi.org/10.1145/3447814.
Jorner, Kjell, Tore Brinck, Per-Ola Norrby, and David Buttar. 2021. “Machine Learning Meets Mechanistic Modelling for Accurate Prediction of Experimental Activation Energies.” Chemical Science 12 (3): 1163–75. https://doi.org/10.1039/D0SC04896H.
Pearl, Judea. 2019. “The Seven Tools of Causal Inference, with Reflections on Machine Learning.” Communications of the ACM 62 (3): 54–60. https://doi.org/10.1145/3241036.
von Rueden, Laura, Sebastian Mayer, Katharina Beckh, Bogdan Georgiev, Sven Giesselbach, Raoul Heese, Birgit Kirsch, et al. 2023. “Informed Machine Learning: A Taxonomy and Survey of Integrating Prior Knowledge into Learning Systems.” IEEE Transactions on Knowledge and Data Engineering 35 (1): 614–33. https://doi.org/10.1109/TKDE.2021.3079836.
Willard, Jared, Xiaowei Jia, Shaoming Xu, Michael Steinbach, and Vipin Kumar. 2022. “Integrating Scientific Knowledge with Machine Learning for Engineering and Environmental Systems.” ACM Computing Surveys, March, 3514228. https://doi.org/10.1145/3514228.
Zampieri, Guido, Supreeta Vijayakumar, Elisabeth Yaneske, and Claudio Angione. 2019. “Machine and Deep Learning Meet Genome-Scale Metabolic Modeling.” PLOS Computational Biology 15 (7): e1007084. https://doi.org/10.1371/journal.pcbi.1007084.